Abstract

Pedestrian detection has been an active
research topic for several years. Many
types of sensors and algorithms have been
explored, with varying levels of success.
Currently, the pedestrian detection
program within the Intelligent Systems
TARDEC Technology Area concentrates
on stereo vision systems: stereo gray
scale, stereo color, and stereo infrared.
Both human detection from a single
stereo-paired frame and tracking
over a sequence of stereo-paired images
are investigated. This paper discusses
the current and future state of these
activities.


1.  Introduction 

Pedestrian detection is an important field
of research. Autonomous and
semi-autonomous vehicles need to identify
people while traversing terrain so that
they can take appropriate action to avoid
them. Driver awareness systems, like those
proposed by the Department of
Transportation Intelligent Transportation
Systems Division, need the ability to
alert drivers to potential problems when
driving through urban areas. Additionally,
the Department of Defense needs pedestrian
detection for path following and mule
operations on robotic vehicles.

In this paper, the focus is on pedestrian
detection and its use in robotic vehicles.
Section 2 discusses the types of
pedestrian detection systems commonly
found in the literature. Section 3
discusses the specifics of vision based
pedestrian detection, which is the focus
of the work performed at the Intelligent
Systems Human Intent and Detection Lab
(HID Lab). The HID Lab projects are
discussed in Section 4. Section 5
concludes the discussion and Section 6
lists references.


2.  Pedestrian  Detection  Systems 

The research area of pedestrian detection
is very large, and there are many
different approaches to the problem. Some
use LADAR or laser scanners to retrieve a
3D map of the terrain and detect
pedestrians [1,2,3,4]; another uses
ultrasonic sensors to measure the
reflection from pedestrians [5]. Radar is
also popular and works similarly to
ultrasonic sensing: it measures the
reflection from possible targets and
determines whether they are pedestrians
[6,7].

A natural choice for a sensor is vision
because it is based on how people
perceive pedestrians. Within the area of
computer vision there are infrared
vision, monocular vision, and stereo
vision processes. Each vision system
has its own advantages and
disadvantages. Infrared systems
[8,9,10,11] are less sensitive to
lighting conditions than other visual
sensors. However, they are more expensive
and their image quality is not as good.
Monocular vision systems [12,13,21] are
cheap and require less processing power,
but they perform poorly at providing
range data and are more sensitive to
color and lighting. Stereo vision systems
[14,15,16,17,18,19,20] have the advantage
of being able to view potential objects
from two points of view. They are also
used to measure the disparity (and hence
the depth) of objects. The drawbacks of
stereo vision systems are that they
require more processing time and are
sensitive to color and lighting
conditions.
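
As a hedged illustration of how a stereo pair yields range data, the
sketch below applies the standard relation for a rectified pair,
Z = f * B / d, where d is the disparity in pixels, f the focal length in
pixels, and B the camera baseline. The function name, parameter names,
and sample values are hypothetical and are not taken from any of the
systems cited above.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return depth in meters for a rectified stereo pair: Z = f * B / d.

    Assumes an idealized pinhole model with parallel optical axes.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 800 px focal length, 0.25 m baseline.
print(depth_from_disparity(20.0, 800.0, 0.25))
```

Because depth is inversely proportional to disparity, a fixed disparity
error of one pixel costs more range accuracy for distant objects than
for near ones.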

3.  Vision  Based  Pedestrian  Detection 
Systems 

Vision based pedestrian detection is a
difficult problem because of limits on
processing speed, the limited robustness
of vision algorithms, and the immaturity
of computational intelligence systems for
recognizing everyday objects. For the
Department of Defense, specifically
TARDEC, vision based detection is
important for non-invasive pedestrian
detection systems in the areas of path
following, mule operations, surveillance,
and driver awareness. The problem
increases in difficulty when considering
the movement of the sensors,
uncontrolled outdoor environments, and
variations in pedestrians’ appearance and
pose. There are many types of
algorithms that try to address these
problems.

Motion based systems are used to detect
pedestrians from image sequences. They
take temporal information into account to
detect periodic features of human
movement [15,17,18]. This technique
produces fewer false positives than other
methods, but it requires a lead time of
several images, which delays the
detection. Another drawback is that it is
unlikely to detect people who are
standing still or making unusual
movements (like jumping).
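
One simple way to look for periodic motion, offered here only as an
illustrative sketch and not as the method of the cited systems, is to
track a scalar feature over time (for example the width of a candidate's
leg region, a hypothetical choice) and search its autocorrelation for a
strong peak. The function name and the test signal below are
assumptions.

```python
def dominant_period(signal, min_lag=2):
    """Return (lag, score) of the strongest autocorrelation peak.

    score is the autocorrelation at that lag divided by the signal
    energy. Returns (None, 0.0) if no positive peak is found.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return None, 0.0  # constant signal: no motion at all
    best_lag, best_score = None, 0.0
    # Only lags up to half the sequence length are tested, so at least
    # two full periods must be observed before a period can be found.
    for lag in range(min_lag, n // 2 + 1):
        acc = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        score = acc / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic feature that repeats every 4 frames, observed for 24 frames.
gait = [0, 1, 0, -1] * 6
print(dominant_period(gait))
```

The need to observe at least two periods is the lead time mentioned
above, and a constant signal (a person standing still) produces no peak
at all.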

Template based systems [14,16,20] can
be used on single frames, so they do not
require a lead time or movement of the
pedestrian. These systems generally match
predefined typical pedestrian shapes
against the image to recognize people in
the picture. The problem with most of
these procedures is that they have
difficulty handling variations in a
pedestrian’s appearance or pose.
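
A minimal sketch of single-frame template matching follows. It slides a
small template over an image and scores each position with normalized
cross-correlation, which is one common matching score; the cited systems
may use other measures such as shape or edge distances. The function
names and the toy image are hypothetical.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equal-sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp = sum(p) / len(p)
    mt = sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p)
                    * sum((b - mt) ** 2 for b in t))
    # A flat patch or template has no structure to correlate with.
    return num / den if den > 0 else 0.0


def best_match(image, template):
    """Return (score, row, col) of the best template position."""
    th, tw = len(template), len(template[0])
    best = (-1.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best


template = [[0, 9],
            [9, 0]]
image = [[1, 1, 1, 1],
         [1, 1, 0, 9],
         [1, 1, 9, 0],
         [1, 1, 1, 1]]
print(best_match(image, template))
```

The score drops as soon as the pattern in the image deviates from the
stored shape, which is why a fixed template set struggles with changes
in appearance or pose.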

One other common method is to detect
the body parts of a person, then put them
together in a logical form to determine
the confidence that the target is a
person [19,21]. This requires a lot
of processing, but is good at detecting
occluded people. However, many false
positives can occur when parts of
unrelated objects are matched as
potential body parts.

4. Intelligent Systems’ Pedestrian
Detection Program

The pedestrian detection programs in the
Intelligent Systems TARDEC
Technology Area look at several
different types of algorithms to detect
people. The main focus is on visual and
infrared sensor based pedestrian
detection systems. Each algorithm is
modeled for the environment it will be
used in, from driver awareness systems
to pedestrian following autonomous
systems. Below is a description of the
pedestrian detection projects
currently being researched by the HID
Lab.

Human  Localization  Using  Gray  Scale 
Stereo  Imagery. 

This program’s main purpose is twofold.
First, it detects humans from a single
stereo-paired image and alerts the
driver to the person’s location with
respect to the cameras; second, it
provides an autonomous robotic system
with the location of people in the scene.
It uses gray scale intensity mapping with
depth information from the stereo
cameras to single out possible people.
Then it removes almost all false positives
by performing a head-shoulders template
check on the candidates. Finally, it sends
(or displays) the pedestrian location.
The head-shoulders check is the only
template matching done in this
algorithm and it is used mainly to
decrease the false positives.

Human  Localization  Using  Infrared 
Stereo  Imagery. 

This program will be used in conjunction
with the gray scale human localization
program to improve the
performance of pedestrian detection
systems. Alone, it works well in day or
night as long as the outside temperature
is below 85 degrees. First, it examines
the higher intensity areas and computes
the distance of those areas in both the
left and right camera views. Any
non-matched items are removed. Then it
populates the remaining regions based on
distance from the camera and correlates
this information with typical human
length/width ratios. Final processing
involves a head-shoulders template
match in the regions of interest and
removes candidates without one.
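
The stereo-matching and ratio steps described above can be sketched as
follows. This is only an illustration under stated assumptions: the
`Region` type, the tolerance values, and the length/width ratio bounds
are hypothetical and are not the values used by the program.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Bounding box of a high-intensity area, in pixels."""
    x: int       # left edge column
    y: int       # top edge row
    width: int
    height: int


def has_stereo_match(left, right_regions, row_tol=3, size_tol=0.25):
    """True if a right-view region plausibly shows the same object."""
    for r in right_regions:
        same_rows = abs(left.y - r.y) <= row_tol
        similar_size = abs(left.height - r.height) <= size_tol * left.height
        # In a rectified pair the object appears further left in the
        # right view, which gives a positive disparity.
        positive_disparity = left.x > r.x
        if same_rows and similar_size and positive_disparity:
            return True
    return False


def human_like(region, min_ratio=1.5, max_ratio=4.0):
    """Check the length/width ratio against assumed human bounds."""
    if region.width <= 0:
        return False
    return min_ratio <= region.height / region.width <= max_ratio


def filter_regions(left_regions, right_regions):
    """Keep left-view regions that are matched and human-shaped."""
    return [l for l in left_regions
            if has_stereo_match(l, right_regions) and human_like(l)]


left = [Region(100, 50, 20, 60),   # matched, tall and narrow
        Region(200, 40, 80, 30),   # matched, but wide and short
        Region(300, 10, 20, 60)]   # no counterpart in the right view
right = [Region(90, 51, 20, 58), Region(185, 40, 80, 30)]
print(filter_regions(left, right))
```

Only the first region survives both checks; in the described pipeline
the survivors would then go on to the head-shoulders template match.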

Combining  the  infrared  stereo  imagery 
and  the  gray  scale  stereo  imagery  will 
provide  a  pedestrian  detection  system 
that  relies  on  several  types  of  data.  The 
information  from  each  of  the  processes 
will  be  combined  intelligently  to 
determine  locations  of  humans.  The 
goal  is  to  choose  the  best  pieces  of 
information  from  both  gray  scale  and 
infrared  processing  algorithms  based  on 
the vehicle’s current environmental
conditions. 
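
How the two sources might be chosen from environmental conditions can be
sketched with a very simple rule. The rule below is an assumption made
for illustration, not the program's fusion logic: it ignores gray scale
output in darkness and ignores infrared output at or above the 85 degree
limit mentioned earlier (the temperature unit is whatever unit that
limit is expressed in). The function and parameter names are
hypothetical.

```python
def select_sources(is_dark, outside_temperature_deg, ir_limit_deg=85.0):
    """Pick which detector outputs to trust for current conditions.

    Assumptions (illustrative only): gray scale output is ignored in
    darkness, and infrared output is ignored at or above ir_limit_deg.
    """
    sources = []
    if not is_dark:
        sources.append("gray")
    if outside_temperature_deg < ir_limit_deg:
        sources.append("infrared")
    return sources


print(select_sources(is_dark=False, outside_temperature_deg=70.0))
print(select_sources(is_dark=True, outside_temperature_deg=70.0))
```

A real system would more likely weight the two sources continuously
rather than switch them on and off, and would need a fallback for the
case where neither source is trusted.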

Color  Stereo  Pedestrian  Detection. 

This project’s main goal is to improve
the current pedestrian detection systems
by adding intelligent techniques to color
processing. The color image is
processed to cluster pixels by shade of
color and distance from the camera.
The image is then matched against
templates for body parts (arms, legs,
torsos, heads, etc.). Each possible body
part is identified, and the locations of
the parts relative to each other are used
to determine whether they form a feasible
person (as well as which person each part
belongs to). This is intended to eliminate
problems with occlusion of people in a
scene.

Pedestrian  Following. 

The main purpose of this project is to
create an algorithm to track a particular
person using color stereo cameras. It
will be implemented on several different
robotic platforms. The system will be
operator initiated: the operator selects
a specific person to follow by clicking
on the person in the human robot
interface (in an image provided by
one of the two cameras attached to the
robotic vehicle). Next, the pedestrian is
segmented from the image and blob
clustering is performed based on color
and disparity. This processed image is
used as a template for the next frame.
As the person starts to move, the region
of movement from one frame to the next
is calculated, and the segmented image
from the previous frame is used as a
template to find and match the location
of the pedestrian. The distance
from the cameras (0,0,0 in world
coordinates) to the pedestrian (x,y,z in
world coordinates) is computed and sent
to the mobility process of the robot. The
template matching is dynamic since the
template changes with each frame. The
computed location can be updated every
second or every minute, based on the
type of following preferred. It will also
be designed to work on any GPS
waypoint robotic vehicle. The
waypoints will be computed from the
calculated pedestrian locations.
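
The distance computation can be illustrated with a short sketch. It
assumes a rectified stereo pair and a pinhole camera whose origin
coincides with the (0,0,0) world origin, as stated above; the function
names, the intrinsic parameters, and the sample pixel are hypothetical.

```python
import math


def pixel_to_xyz(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Back-project a pixel with known disparity to (x, y, z) in meters.

    (cx, cy) is the principal point; z points along the optical axis.
    """
    z = focal_px * baseline_m / disparity_px
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z


def range_and_bearing(x, y, z):
    """Straight-line distance from the origin and horizontal bearing."""
    rng = math.sqrt(x * x + y * y + z * z)
    bearing = math.atan2(x, z)  # radians, positive toward +x
    return rng, bearing


# Hypothetical pedestrian centroid at pixel (400, 240), disparity 20 px.
x, y, z = pixel_to_xyz(400, 240, 20.0, 800.0, 0.25, 320.0, 240.0)
print((x, y, z))
print(range_and_bearing(x, y, z))
```

The range and bearing pair is one plausible form in which the location
could be handed to the mobility process; converting it to a GPS waypoint
would additionally require the vehicle's own position and heading.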

Fused  Infrared  and  Gray  Scale 
Pedestrian  Detection/Enhancement. 

This project is a joint effort between the
HID Lab and the Perception Lab. The
Perception Lab’s effort is focused on
human perception studies of fusion
techniques for gray scale and
infrared imagery. The HID Lab’s focus
is on using the same fusion techniques
but applying machine intelligence to the
problem. The goal of this project is to
have the computer detect pedestrians
with results comparable to those recorded
in the human studies. An investigation
into computer enhancement of the fused
imagery will also be performed.

5.  Conclusion 

Within the next decade robotic vehicles
will be introduced into the battlefield in
large numbers as a result of FCS. This
will dictate a change in doctrine on how
the Army fights. Robots and soldiers
will be in the field together and will
need to coexist and function in teams. It
is imperative that robotic systems have an
error-free pedestrian detection system
available.

TARDEC’s Intelligent Systems Human
Intent and Detection (HID) Lab is
focused on providing a quality
pedestrian detection system suited to the
type of environment it will be used in.
The pedestrian detection efforts described
in this paper will feed the Intelligent
Systems Army Technology Objectives
(ATOs) such as Armed Robotic Vehicle
(ARV) Robotic Technologies (ART) and
Human Robot Interface (HRI). These
ATOs provide technologies that
transition to FCS platforms.